Abstract

We prove lower bounds on the number of product gates in bilinear and quadratic circuits that
compute the product of two $n \times n$ matrices over finite fields. In particular, we obtain the following
results:

1. We show that the number of product gates in any bilinear (or quadratic) circuit that computes
the product of two $n \times n$ matrices over $GF(2)$ is at least $3n^2 - o(n^2)$.

2. We show that the number of product gates in any bilinear circuit that computes the product
of two $n \times n$ matrices over $GF(p)$ is at least $(2.5 + \frac{1.5}{p^3-1})n^2 - o(n^2)$.

These results improve on the earlier lower bounds of $2.5n^2 - o(n^2)$ due to Bshouty and Bläser.

1 Introduction 

The problem of computing the product of two matrices is one of the most studied computational
problems: we are given two $n \times n$ matrices $x = (x_{i,j})$, $y = (y_{i,j})$, and we wish to compute their
product, i.e. there are $n^2$ outputs where the $(i,j)$'th output is

$$(x \cdot y)_{i,j} = \sum_{k=1}^{n} x_{i,k} \cdot y_{k,j} .$$



In '69 Strassen surprised the world by showing an upper bound of $O(n^{\log_2 7})$ [12]. This bound
was later improved, and the best upper bound today is $O(n^{2.376})$. The
best lower bound is $2.5n^2 - o(n^2)$ on the number of products needed to compute
the function. Thus the following problem is still open: can matrix product be computed by a
circuit of size $O(n^2)$?

The standard computational model for computing polynomials is the model of arithmetic circuits,
i.e. circuits over the base $\{+, \cdot\}$ over some field $F$. This is indeed the most general model, but for
matrix product two other models are usually considered, quadratic circuits and bilinear circuits. In
the quadratic model we require that product gates are applied only on two linear functions. In the
bilinear model we also require that product gates are applied only on two linear functions, but in
addition we require that the first linear function is linear in the variables of $x$ and that the second
linear function is linear in the variables of $y$. These models are more restricted than the general
model of arithmetic circuits. However, it is interesting to note that over infinite fields we can always
assume w.l.o.g. that any circuit for matrix product is a quadratic circuit [13]. In addition, we note
that the best circuits that we have today for matrix product are bilinear circuits.

In this paper we prove that any quadratic circuit that computes matrix product over the field
$GF(2)$ has at least $3n^2 - o(n^2)$ product gates, and that any bilinear circuit for matrix product over
the field $GF(p)$ must have at least $(2.5 + \frac{1.5}{p^3-1})n^2 - o(n^2)$ product gates.

From now on we will use the notation $MP_n$ to denote the problem of computing the product of
two $n \times n$ matrices.

1.1 Known Lower Bounds 

In contrast to the major advances in proving upper bounds, the attempts to prove lower bounds
on the size of bilinear circuits that compute $MP_n$ were less successful. Denote by $q_*(MP_n)$ and
$bl_*(MP_n)$ the number of product gates in a smallest quadratic circuit for $MP_n$, and in a smallest
bilinear circuit for $MP_n$, respectively. We also denote by $bl_{tot}(MP_n)$ the total number of gates in a
smallest bilinear circuit for $MP_n$. In '78 Brockett and Dobkin proved that $bl_*(MP_n) \ge 2n^2 - 1$ over
any field. This lower bound was later generalized by Lafon and Winograd to a lower bound on
$q_*(MP_n)$ over any field. In '89 Bshouty showed that over $GF(2)$, $q_*(MP_n) \ge 2.5n^2 - O(n \log n)$.
Recently Bläser proved a lower bound of $2n^2 + n - 3$ on $q_*(MP_n)$ over any field. Bläser also
proved that $bl_*(MP_n) \ge 2.5n^2 - 3n$ over any field.

It is also known that any bounded depth circuit for $MP_n$, over any field, has super linear (in
$n^2$) size. Notice, however, that the best known circuits for $MP_n$ have depth $\Omega(\log n)$.

1.2 Bilinear Rank 

An important notion that is highly related to the problem of computing matrix product in bilinear
circuits is the notion of bilinear rank.

A bilinear form in two sets of variables $x$, $y$ is a polynomial in the variables of $x$ and the variables
of $y$, which is linear in the variables of $x$ and linear in the variables of $y$. Clearly each output of
$MP_n$ is a bilinear form in $x = \{x_{i,j}\}$, $y = \{y_{i,j}\}$. The bilinear rank of a set of bilinear forms
$\{ b_1(x,y), \ldots, b_m(x,y) \}$ is the smallest number of rank 1 bilinear forms that span $b_1, \ldots, b_m$,
where a rank 1 bilinear form is a product of a linear form in the $x$ variables and a linear form in the
$y$ variables. We denote by $R_F(b_1, \ldots, b_m)$ the bilinear rank of $\{ b_1, \ldots, b_m \}$ over the field $F$.

We denote by $R_F(MP_n)$ the bilinear rank over $F$ of the $n^2$ outputs of matrix product, i.e. it is
the bilinear rank of the set $\{ \sum_{k=1}^{n} x_{i,k} \cdot y_{k,j} \}_{i,j}$ over $F$.
The following inequalities are obvious (over any field).

• $q_*(MP_n) \le bl_*(MP_n) \le 2 q_*(MP_n)$.

• $R_F(MP_n) = bl_*(MP_n)$.

• The following inequality is less obvious, but also not so hard to see.

$$bl_*(MP_n) \le bl_{tot}(MP_n) \le \mathrm{poly}(\log n) \cdot bl_*(MP_n) .$$

I.e. up to polylogarithmic factors, the number of product gates in a smallest bilinear circuit
for $MP_n$, over any field $F$, is equal to the total number of gates in the circuit.
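
As a concrete illustration of bilinear rank (not taken from this paper's argument), Strassen's classical identities exhibit seven rank 1 bilinear forms that span the four outputs of $MP_2$, so $R_F(MP_2) \le 7$ over any field. The following Python sketch checks the identities over the integers on random inputs; the function names are illustrative assumptions.

```python
import random

def strassen_2x2(a, b):
    """Multiply two 2x2 matrices using seven rank 1 bilinear forms."""
    (a11, a12), (a21, a22) = a
    (b11, b12), (b21, b22) = b
    # Each m_i is a product of a linear form in a and a linear form in b.
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    # Each output is a linear combination of the seven products.
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4, m1 - m2 + m3 + m6]]

def naive_2x2(a, b):
    """Reference product computed directly from the definition."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def random_matrix(rng):
    return [[rng.randint(-9, 9) for _ in range(2)] for _ in range(2)]

rng = random.Random(0)
all_agree = all(
    strassen_2x2(a, b) == naive_2x2(a, b)
    for a, b in ((random_matrix(rng), random_matrix(rng)) for _ in range(1000))
)
```

Applying the same identities recursively to block matrices is what yields the $O(n^{\log_2 7})$ upper bound mentioned in the introduction.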

1.3 Results and Methods 

We prove that any quadratic circuit that computes $MP_n$ over the field $GF(2)$ has at least $3n^2 - o(n^2)$
product gates (i.e. $q_*(MP_n) \ge 3n^2 - o(n^2)$ over $GF(2)$). We also prove that over the field $GF(p)$
every bilinear circuit for $MP_n$ must have at least $(2.5 + \frac{1.5}{p^3-1})n^2 - o(n^2)$ product gates
(i.e. $bl_*(MP_n) \ge (2.5 + \frac{1.5}{p^3-1})n^2 - o(n^2)$ over $GF(p)$). Both of these results actually hold for the bilinear rank as well.

The proof of the lower bound over $GF(2)$ is based on techniques from the theory of linear codes.
However, we cannot use known results from coding theory in a straightforward way, since we are not
dealing with codes in which every two words are distant, but rather with codes on matrices in which
the distance between two code words, of two matrices, is proportional to the rank of the difference of
the matrices. The reduction from circuits to codes and the proof of the bound are given in section 4.

The proof of the second bound is based on a lemma proved by Bläser. We prove that in the
case of finite fields we can use the lemma with better parameters than those used by Bläser. This
result is proved in section 5.

1.4 Organization of the paper 

In section 2 we present the models of bilinear circuits and quadratic circuits. In section 3 we present
some algebraic and combinatorial tools that we need for the proofs of our lower bounds.

In section 4 we introduce the notion of linear codes of matrices, and prove our lower bound on
bilinear and quadratic circuits that compute $MP_n$ over $GF(2)$. In section 5 we prove our lower
bound on bilinear circuits that compute $MP_n$ over $GF(p)$.

2 Arithmetic Models 

In this section we present the models of quadratic circuits and bilinear circuits. These are the models
for which we prove our lower bounds. We first give the definition of a general arithmetic circuit. An
arithmetic circuit over a field $F$ is a directed acyclic graph as follows. Nodes of in-degree 0 are called
inputs and are labeled with input variables. Nodes of out-degree 0 are called outputs. Each edge
is labeled with a constant from the field and each node other than an input is labeled with one of
the following operations $\{+, \cdot\}$; in the first case the node is a plus gate and in the second case a
product gate. The computation is done in the following way. An input just computes the value of
the variable that labels it. Then, if $v_1, \ldots, v_k$ are the vertices that fan into $v$, we multiply the
result of each $v_i$ with the value of the edge that connects it to $v$. If $v$ is a plus gate we sum all the
results, otherwise $v$ is a product gate and we multiply all the results. Obviously the value computed
by each node in the circuit is a polynomial over $F$ in the input variables.
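
The evaluation rule above can be sketched as follows. This is a minimal illustration, assuming a hypothetical dictionary encoding of the circuit listed in topological order and using Python integers in place of elements of $F$; it is not a construction from the paper.

```python
from math import prod

def evaluate(circuit, inputs):
    """Evaluate an arithmetic circuit whose nodes are listed in topological order.

    circuit maps a node name to ("in", variable) for an input node, or to
    (op, [(child, c), ...]) where op is "+" or "*" and c is the constant
    labelling the edge from child to the node.
    """
    value = {}
    for node, spec in circuit.items():
        if spec[0] == "in":
            value[node] = inputs[spec[1]]
        else:
            op, edges = spec
            # Scale each incoming result by its edge constant, then combine.
            scaled = [c * value[child] for child, c in edges]
            value[node] = sum(scaled) if op == "+" else prod(scaled)
    return value

# Hypothetical toy circuit computing (x1 + 2*x2) * (3*y1): a single product gate
# applied to a linear form in x and a linear form in y, hence a bilinear circuit.
toy = {
    "x1": ("in", "x1"), "x2": ("in", "x2"), "y1": ("in", "y1"),
    "lx": ("+", [("x1", 1), ("x2", 2)]),
    "ly": ("+", [("y1", 3)]),
    "out": ("*", [("lx", 1), ("ly", 1)]),
}
result = evaluate(toy, {"x1": 5, "x2": 1, "y1": 2})["out"]
```

Counting the nodes labelled "*" in such an encoding gives the number of product gates, which is the quantity bounded from below in this paper.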

We are interested in the problem of computing the product of two $n \times n$ matrices, $MP_n$. The
input consists of two $n \times n$ matrices $x$, $y$. The output is the matrix $x \cdot y$, i.e., there are $n^2$ outputs,
and the $(i,j)$'th output is:

$$(x \cdot y)_{i,j} = \sum_{k=1}^{n} x_{i,k} \cdot y_{k,j} .$$

Each output $(x \cdot y)_{i,j}$ is hence a bilinear form in $x$ and $y$.

Since each output of $MP_n$ is a bilinear form, it is natural to consider bilinear arithmetic circuits
for it. A bilinear arithmetic circuit is an arithmetic circuit with the additional restriction that
product gates are applied only on two linear functions, one function is linear in the variables of
$x$ and the other function is linear in the variables of $y$. Thus, bilinear circuits have the following
structure. First, there are many plus gates computing linear forms in $x$ and linear forms in $y$. Then
there is one level of product gates that compute bilinear forms, and finally there are many plus gates
that eventually compute the outputs. We will be interested in bounding from below the number of
products in any bilinear circuit for $MP_n$. This model is more restricted than the general model of
arithmetic circuits, but we note that all the known upper bounds (over any field) for $MP_n$ are by
bilinear circuits.

Another model that we will consider is the model of quadratic circuits. A quadratic circuit is an
arithmetic circuit with the additional restriction that product gates are applied only on two linear
functions. Notice that the only difference between quadratic circuits and bilinear circuits is that
in the quadratic model the product gates compute quadratic forms in $x, y$, whereas in the bilinear
model the product gates compute bilinear forms in $x, y$. This model is more general than the model
of bilinear circuits, but it is still more restricted than the general model. However, it is interesting
to note that over infinite fields we can assume w.l.o.g. that any arithmetic circuit for $MP_n$ is a
quadratic circuit [13].



3 Algebraic and Combinatorial tools 

In this section we present some algebraic and combinatorial tools that we will use. 

The following lemma is an extremely weak variant of the famous Schwartz-Zippel lemma, which
shows that every nonzero polynomial (nonzero as a formal expression) over a large enough field has
a nonzero assignment in the field (see [11], [15]).

Lemma 1 Let $P$ be a polynomial of degree $d$ in $x_1, \ldots, x_n$ over some field $F$, such that $d < |F|$,
and such that at least one of the coefficients of $P$ is not zero. Then we can find an assignment,
$\rho \in F^n$, to the $x_i$'s, such that $P(\rho_1, \ldots, \rho_n) \neq 0$.

We say that two polynomials $p, q$ in $n$ variables are equivalent over a field $F$ if
$p(x_1, \ldots, x_n) = q(x_1, \ldots, x_n)$ for any $x_1, \ldots, x_n \in F$. We denote $p \equiv q$ if $p$ and $q$ are equivalent over $F$ (we omit $F$
from the notation as the field that we deal with will be clear from the context).

Lemma 2 Let $P$ be a polynomial of degree $d$ in the variables $x_1, \ldots, x_n$ over a field $F$. If $P \not\equiv 0$
then we can find an assignment, $\rho \in F^n$, to the $x_i$'s such that at most $d$ of the $\rho_i$'s get a nonzero
value, and such that $P(\rho_1, \ldots, \rho_n) \neq 0$.

Proof: $P$ is equal (as a function) to a polynomial $\bar{P}$ in which the degree of each variable is at most
$|F| - 1$. We call $\bar{P}$ the reduction of $P$. Consider some monomial $M$ in $\bar{P}$ whose coefficient is not zero.
We assign all the variables that do not appear in $M$ to zero. The resulting polynomial (after the
assignment) is a polynomial in the variables of $M$, which is not the zero polynomial as it is a reduced
polynomial which has a monomial with a nonzero coefficient ($M$ of course). Therefore, according to
lemma 1, there is some assignment to the variables of $M$ that gives this polynomial a nonzero value.
Therefore we have found an assignment which gives nonzero values only to the variables of $M$ (and
there are at most $d$ such variables) under which $P \neq 0$. ♣

The following useful lemma, which is a straightforward implication of the previous lemma, is the
key lemma in most of our proofs. The lemma deals with linear forms in $n^2$ variables. From now on
we shall think about such linear forms as linear forms in the entries of $n \times n$ matrices.

Lemma 3 Let $\mu_1, \ldots, \mu_{n^2}$ be $n^2$ linearly independent linear forms in $n^2$ variables over some field
$F$. Let $P$ be a polynomial of degree $d$ in $kn^2$ variables over $F$, i.e. we can view $P$ as a polynomial
$P(x_1, \ldots, x_k)$ in the entries of $k$ matrices, $x_1, \ldots, x_k$, of size $n \times n$ each. Assume that $P \not\equiv 0$. Then we
can find $k$ matrices $a_1, \ldots, a_k \in M_n(F)$ such that $P(a_1, \ldots, a_k) \neq 0$ and such that there exist $n^2 - d$
linear forms among $\mu_1, \ldots, \mu_{n^2}$ that vanish on all the $a_i$'s.

Proof: The idea of the proof is the following. Let $b_1, \ldots, b_{n^2}$ be the dual basis of $\mu_1, \ldots, \mu_{n^2}$, i.e.
it is a basis of $M_n(F)$ satisfying $\forall i,j \; \mu_i(b_j) = \delta_{i,j}$. We wish to find $k$ matrices, $a_1, \ldots, a_k$, such that
$P(a_1, \ldots, a_k) \neq 0$, and such that there exist $b_{i_1}, \ldots, b_{i_d}$ that span all of them. If we manage to find
such matrices, then since the $b_j$'s are the dual basis to the $\mu_i$'s we will get that $n^2 - d$ of the $\mu_i$'s
vanish on $a_1, \ldots, a_k$. The way to find such matrices that are contained in the span of a small subset
of the $b_i$'s is based on lemma 2.

So let $b_1, \ldots, b_{n^2}$ be the dual basis to $\mu_1, \ldots, \mu_{n^2}$, i.e. $\forall i,j \; \mu_i(b_j) = \delta_{i,j}$. We now change the
variables of $P$. Let $\alpha_{i,j}$, $j = 1 \ldots k$, $i = 1 \ldots n^2$, be a set of $kn^2$ variables. Denote $x_j = \sum_{i=1}^{n^2} \alpha_{i,j} b_i$.
Thus $P(x_1, \ldots, x_k)$ can be viewed as a polynomial of degree $d$ in the $kn^2$ variables $\alpha_{i,j}$. Since the change
of variables is invertible, $P \not\equiv 0$ as a polynomial in the $\alpha_{i,j}$'s. Hence, according to lemma 2, there exists an assignment, $\rho$,
to the $\alpha_{i,j}$'s such that at most $d$ of them get a nonzero value and $P$ is nonzero on it. Define $a_j = \sum_{i=1}^{n^2} \rho_{i,j} b_i$. Clearly
$P(a_1, \ldots, a_k) \neq 0$. Since at most $d$ of the $\rho_{i,j}$'s got nonzero values, we see that there are at most
$d$ $b_i$'s such that all the $a_j$'s are linear combinations of them. Since the $b_i$'s are the dual basis to
$\mu_1, \ldots, \mu_{n^2}$ we get that there are at least $n^2 - d$ of the $\mu_i$'s that vanish on all the $a_j$'s. Therefore
$a_1, \ldots, a_k$ satisfy the requirements of the lemma. ♣

The next lemma will enable us to translate properties of matrices over large fields of characteristic
$p$ to properties of matrices (of higher dimension) over $GF(p)$.

Lemma 4 There exists an embedding, $\phi : GF(p^n) \hookrightarrow M_n(GF(p))$. That is, there exists a mapping
$\phi : GF(p^n) \to M_n(GF(p))$ such that

• $\phi$ is a one to one linear transformation.

• $\phi(1) = I$, where $I$ is the $n \times n$ identity matrix.

• $\phi$ is multiplicative, i.e. $\forall x, y \in GF(p^n)$ we have that $\phi(xy) = \phi(x) \cdot \phi(y)$.

This embedding also induces an embedding $M_k(GF(p^n)) \hookrightarrow M_{nk}(GF(p))$.

This lemma is a standard tool in algebra, but for completeness we give the proof.
Proof: $GF(p^n)$ is an $n$ dimensional vector space over $GF(p)$. Each element $x \in GF(p^n)$ can be
viewed as a linear transformation $x : GF(p^n) \to GF(p^n)$ in the following way:

$$\forall y \in GF(p^n) \quad x(y) = x \cdot y .$$

Clearly this is a linear transformation of $GF(p^n)$ into itself, as a vector space over $GF(p)$. Therefore,
by picking a basis of $GF(p^n)$ we can represent the linear transformation corresponding to each $x \in GF(p^n)$
by a matrix $a_x \in M_n(GF(p))$. Thus, we have defined a mapping $\phi : GF(p^n) \to M_n(GF(p))$
such that $\phi(x) = a_x$, and it is easy to verify that this mapping is an embedding of $GF(p^n)$ into
$M_n(GF(p))$. The way to generalize it to an embedding of $M_k(GF(p^n))$ into $M_{nk}(GF(p))$ is the
following. Let $a = (a_{i,j}) \in M_k(GF(p^n))$ be some matrix. Every entry $a_{i,j}$ of $a$ is some element of
$GF(p^n)$. We can now replace $a_{i,j}$ with the matrix $\phi(a_{i,j})$. Thus the resulting matrix will be a $kn \times kn$
matrix whose entries are in $GF(p)$. Again it is easy to verify that this is indeed an embedding of
$M_k(GF(p^n))$ into $M_{nk}(GF(p))$.

In addition to the algebraic lemmas we also need the following combinatorial tools.

Definition 1 Let $F$ be a field, and let $v, u$ be two vectors in $F^m$. We denote by $\mathrm{weight}(v)$ the
number of nonzero coordinates of $v$. Let $d_H(v,u) = \mathrm{weight}(v - u)$, i.e. $d_H(v,u)$ is the number
of coordinates on which $u$ and $v$ differ. $d_H(v,u)$ is also known as the Hamming distance of $u$
and $v$. We also denote by $\mathrm{agree}(u,v)$ the number of coordinates on which $u$ and $v$ are equal, i.e.
$\mathrm{agree}(u,v) = m - d_H(v,u)$.

The next lemma shows that if a vector space contains a set of vectors such that every pair/triplet
of them don't agree on many coordinates (i.e. their Hamming distance is large) then it is of large
dimension. There are numerous similar lemmas in coding theory, and in particular the first part of
our lemma is the famous Plotkin bound (see [14]).

Lemma 5

1. In every set of $k$ vectors in $GF(p)^t$, such that $p < k$, there are two vectors that
agree on at least $\left(\frac{t}{p} - \frac{t}{k}\right)$ coordinates.

2. In every set of $k$ vectors in $GF(p)^t$, such that $2p < k$, there are three vectors that agree on at
least $\left(\frac{t}{p^2} - \frac{3t}{pk}\right)$ coordinates.

Proof: We begin by proving the first claim. Let $v_1, \ldots, v_k$ be $k$ vectors in $GF(p)^t$. We are going to
estimate $\sum_{i < j} \mathrm{agree}(v_i, v_j)$ in two different ways. On the one hand this sum is at most $\binom{k}{2}$ times the
maximum of $\mathrm{agree}(v_i, v_j)$. On the other hand consider a certain coordinate. For every $\alpha \in GF(p)$
denote by $n_\alpha$ the number of vectors among the $v_i$'s that are equal to $\alpha$ on this coordinate. Clearly
$\sum_{\alpha=0}^{p-1} n_\alpha = k$. The contribution of this coordinate to $\sum_{i < j} \mathrm{agree}(v_i, v_j)$ is exactly $\sum_{\alpha=0}^{p-1} \binom{n_\alpha}{2}$. By
convexity

$$\sum_{\alpha=0}^{p-1} \binom{n_\alpha}{2} \ge p \binom{k/p}{2} = \frac{k(k-p)}{2p} .$$

We get that

$$\binom{k}{2} \cdot \max_{i < j} \mathrm{agree}(v_i, v_j) \ge \sum_{i < j} \mathrm{agree}(v_i, v_j) \ge t \cdot \frac{k(k-p)}{2p} .$$

Therefore

$$\max_{i < j} \mathrm{agree}(v_i, v_j) \ge \frac{t}{p} \cdot \frac{k-p}{k-1} \ge \frac{t}{p} \cdot \frac{k-p}{k} = \frac{t}{p} - \frac{t}{k} .$$

The proof of the second claim is similar. We give two different estimates to $\sum_{i < j < l} \mathrm{agree}(v_i, v_j, v_l)$
(the number of coordinates on which $v_i$, $v_j$, and $v_l$ are the same). In the same manner as before we
get that

$$\max_{i < j < l} \mathrm{agree}(v_i, v_j, v_l) \ge \frac{t}{p^2} \cdot \frac{k-p}{k-1} \cdot \frac{k-2p}{k-2} \ge \frac{t}{p^2} - \frac{3t}{pk} .$$



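Corollary 1 If $\{0,1\}^t$ contains $k$ vectors $v_1, \ldots, v_k$, such that $2 < k$ and $\forall i \neq j \;\; d_H(v_i, v_j) \ge N$,
then $t \ge 2N - \frac{4N}{k+2}$.

Proof: According to lemma 5 there are two vectors, w.l.o.g. $v_1$ and $v_2$, such that
$\mathrm{agree}(v_1, v_2) \ge \frac{t}{2} - \frac{t}{k}$. Since $d_H(v_1, v_2) = t - \mathrm{agree}(v_1, v_2)$ we get that

$$t - \left(\frac{t}{2} - \frac{t}{k}\right) \ge d_H(v_1, v_2) \ge N$$

and the result follows. ♣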
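
The first claim of lemma 5, on which corollary 1 rests, can be sanity-checked numerically. The sketch below, with illustrative function names and arbitrarily chosen parameters, samples random sets of $k$ vectors in $GF(p)^t$ and checks that some pair agrees on at least $t/p - t/k$ coordinates.

```python
import random
from itertools import combinations

def agree(u, v):
    """Number of coordinates on which u and v are equal."""
    return sum(a == b for a, b in zip(u, v))

def max_pairwise_agreement(vectors):
    return max(agree(u, v) for u, v in combinations(vectors, 2))

def check_lemma5_part1(p, t, k, trials, seed=0):
    """Check max agree >= t/p - t/k on random sets of k vectors in GF(p)^t."""
    rng = random.Random(seed)
    for _ in range(trials):
        vectors = [[rng.randrange(p) for _ in range(t)] for _ in range(k)]
        # Multiply the inequality by p*k to stay in exact integer arithmetic:
        # max_agree >= t/p - t/k  is equivalent to  p*k*max_agree >= t*(k - p).
        if p * k * max_pairwise_agreement(vectors) < t * (k - p):
            return False
    return True

binary_ok = check_lemma5_part1(p=2, t=12, k=5, trials=200)
ternary_ok = check_lemma5_part1(p=3, t=10, k=7, trials=200)
```

Random sampling only probes the inequality; the averaging argument in the proof is what shows that it holds for every set of vectors.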



4 Lower bound over GF(2) 

In this section we prove our main theorems. 

Theorem 1 $bl_*(MP_n) \ge 3n^2 - O(n^{5/3})$ (in other words $R_{GF(2)}(MP_n) \ge 3n^2 - O(n^{5/3})$).

The second theorem that we shall prove is a lower bound for quadratic circuits.

Theorem 2 $q_*(MP_n) \ge 3n^2 - O(n^{5/3})$. I.e. the number of product gates in any quadratic circuit that
computes the product of two $n \times n$ matrices over $GF(2)$ is at least $3n^2 - O(n^{5/3})$.

Clearly theorem 2 implies theorem 1, but we first prove theorem 1 as it is more intuitive and
simple. We begin by introducing the notion of linear codes of matrices.

4.1 Linear Codes of Matrices 

Definition 2 A linear code of matrices is a mapping,

$$\Gamma : M_n(GF(2)) \to \{0,1\}^m ,$$

(for some $m$) with the following properties:

• $\Gamma$ is linear.

• For any matrix $a$, $\mathrm{weight}(\Gamma(a)) \ge n \cdot \mathrm{rank}(a)$.

From the linearity of $\Gamma$ and the requirement on $\mathrm{weight}(\Gamma(a))$ we get the following corollary.

Corollary 2 $\Gamma$ is a one to one mapping, and for any two matrices $a$ and $b$,
$d_H(\Gamma(a), \Gamma(b)) \ge n \cdot \mathrm{rank}(a - b)$.

The following theorem shows that the dimension of the range of any linear code of matrices is
large (i.e. $m$ must be large).

Theorem 3 Let $\Gamma : M_n(GF(2)) \to \{0,1\}^m$ be a linear code of matrices, then $m \ge 3n^2 - O(n^{5/3})$.
Proof: Denote

$$\Gamma(a) = ( \mu_1(a), \ldots, \mu_m(a) ) .$$

The proof is based on the following lemma that shows that we can find $k = n^{1/3}$ matrices,
$a_1, \ldots, a_k \in M_n(GF(2))$, with the following properties.

• $\forall i \neq j$, $a_i - a_j$ is an invertible matrix.

• There are $n^2 - \binom{k}{2} n$ linear forms among the $\mu_i$'s that vanish on all the $a_i$'s.

We state the lemma for every $k < 2^n$ but we apply it only to $k = n^{1/3}$.

Lemma 6 For every $n, k$ such that $k < 2^n$, and any $\mu_1, \ldots, \mu_{n^2}$ linearly independent linear forms
in $n^2$ variables over $GF(2)$, there are $k$ matrices, $a_1, \ldots, a_k \in M_n(GF(2))$, such that for every $i \neq j$,
$a_i - a_j$ is an invertible matrix, and such that $n^2 - \binom{k}{2} n$ of the $\mu_i$'s vanish on them.

Proof: Consider the following polynomial $P$ in $k$ matrices:

$$P(a_1, \ldots, a_k) = \mathrm{determinant}\Big( \prod_{i < j} (a_i - a_j) \Big) .$$

Clearly a set of $k$ matrices $a_1, \ldots, a_k$ satisfies $P(a_1, \ldots, a_k) \neq 0$ iff all the matrices $a_i - a_j$ are invertible.
In addition, it is easy to see that $\deg(P) = \binom{k}{2} n$. Therefore if we show that $P \not\equiv 0$ over $GF(2)$, then
according to lemma 3 we will get what we wanted to prove.

In order to show that $P \not\equiv 0$ we just have to prove the existence of $k$ matrices such that the
difference of every two of them is invertible. Lemma 4 assures us that we can embed the field $GF(2^n)$
into $M_n(GF(2))$. Denote this embedding by $\phi : GF(2^n) \hookrightarrow M_n(GF(2))$. We take $k$ distinct elements
in $GF(2^n)$, $x_1, \ldots, x_k$. Their images, $\phi(x_1), \ldots, \phi(x_k)$, are matrices in $M_n(GF(2))$ such that the
difference of every two of them, $\phi(x_i) - \phi(x_j) = \phi(x_i - x_j)$, is an invertible matrix. This is because
the $x_i$'s are distinct (i.e. $x_i - x_j \neq 0$), and every nonzero element in $GF(2^n)$ is invertible. Thus,
$\phi(x_1), \ldots, \phi(x_k)$ are exactly the $k$ matrices that we were looking for. This concludes the proof of
the lemma. ♣
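
For a small case, the existence argument in this proof can be checked directly. The sketch below assumes $n = 3$ and represents $GF(2^3)$ by polynomials modulo $X^3 + X + 1$ (one possible choice of irreducible polynomial, not fixed by the text). It builds the embedding of lemma 4 in the basis $1, X, X^2$ and verifies that it is multiplicative and that the images of distinct field elements differ by invertible matrices. All names are illustrative.

```python
from itertools import combinations

N = 3
MODULUS = 0b1011  # X^3 + X + 1, irreducible over GF(2)

def gf_mul(a, b):
    """Multiply two elements of GF(2^N), stored as bit masks of coefficients."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> N:          # degree reached N: reduce modulo MODULUS
            a ^= MODULUS
    return result

def embed(x):
    """Matrix over GF(2) of the map y -> x*y in the basis 1, X, ..., X^(N-1)."""
    columns = [gf_mul(x, 1 << j) for j in range(N)]
    return [[(columns[j] >> i) & 1 for j in range(N)] for i in range(N)]

def mat_mul(a, b):
    return [[sum(a[i][k] & b[k][j] for k in range(N)) % 2 for j in range(N)]
            for i in range(N)]

def mat_add(a, b):
    """Addition over GF(2); it coincides with subtraction."""
    return [[a[i][j] ^ b[i][j] for j in range(N)] for i in range(N)]

def is_invertible(m):
    """Gaussian elimination over GF(2)."""
    rows = [row[:] for row in m]
    for col in range(N):
        pivot = next((r for r in range(col, N) if rows[r][col]), None)
        if pivot is None:
            return False
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(N):
            if r != col and rows[r][col]:
                rows[r] = [x ^ y for x, y in zip(rows[r], rows[col])]
    return True

elements = range(2 ** N)
multiplicative = all(embed(gf_mul(x, y)) == mat_mul(embed(x), embed(y))
                     for x in elements for y in elements)
differences_invertible = all(is_invertible(mat_add(embed(x), embed(y)))
                             for x, y in combinations(elements, 2))
```

Here all $2^3$ field elements are used, so the check covers every admissible choice of $k$ distinct elements for this $n$.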

We proceed with the proof of the theorem. Let $k = n^{1/3}$. Since $\Gamma$ is a one to one mapping, there
are $n^2$ independent linear forms among $\mu_1, \ldots, \mu_m$. Therefore we can use lemma 6 and get that
there are $k$ matrices $a_1, \ldots, a_k$ such that for every $i \neq j$, $a_i - a_j$ is invertible, and such that, w.l.o.g.,
$\mu_{m-r+1}, \ldots, \mu_m$ vanish on $a_1, \ldots, a_k$ for some $r \ge n^2 - \binom{k}{2} n \ge n^2 - n^{5/3}$.

Since the last $r$ linear forms vanish on all the $a_i$'s, we are going to restrict our attention only
to the first $m - r$ linear forms. So from now on we only consider $\Gamma(a_i)$ restricted to its first $m - r$
coordinates.
Since each of the differences, $a_i - a_j$ ($\forall i \neq j$), is an invertible matrix, we get that
$d_H(\Gamma(a_i), \Gamma(a_j)) \ge n^2$. Thus, $\Gamma(a_1), \ldots, \Gamma(a_k)$ are vectors contained in $\{0,1\}^{m-r}$ (we consider
only their first $m - r$ coordinates!) such that the Hamming distance of every pair of them is
at least $n^2$. Therefore according to corollary 1 we get that

$$m - r \ge 2n^2 - \frac{4n^2}{k+2} .$$

Since $r \ge n^2 - n^{5/3}$ and $k = n^{1/3}$, we get that

$$m \ge 3n^2 - O(n^{5/3})$$

which is what we wanted to prove. This concludes the proof of the theorem. ♣



4.2 Proof of Theorem 1

Assume that $bl_*(MP_n) = m$. Let $C$ be a smallest bilinear circuit for $MP_n$. Let

$$\mu_1(x) \cdot \eta_1(y), \; \ldots, \; \mu_m(x) \cdot \eta_m(y)$$

be the $m$ bilinear forms computed in the product gates of $C$. We will show that these bilinear forms
define in a very natural way a code on $M_n(GF(2))$. The code thus defined will have the property
that the dimension of the space into which the code maps $M_n(GF(2))$ is exactly $m$. Thus, according
to theorem 3 we will get that $m \ge 3n^2 - O(n^{5/3})$, which is what we wanted to prove.

So we begin by defining a mapping from $M_n(GF(2))$ to $\{0,1\}^m$. Let $\Gamma : M_n(GF(2)) \to \{0,1\}^m$
be the following mapping.

$$\Gamma(x) = (\mu_1(x), \ldots, \mu_m(x)) .$$

Notice that we ignore the $\eta_i$'s in the definition of $\Gamma$. The next lemma shows that $\Gamma$ is a linear code
of matrices.

Lemma 7 $\Gamma$ is a linear transformation with the property that for every matrix $x \in M_n(GF(2))$,
$\mathrm{weight}(\Gamma(x)) \ge n \cdot \mathrm{rank}(x)$.

Proof: Clearly $\Gamma$ is a linear transformation from $M_n(GF(2))$ to $\{0,1\}^m$. So we only have to prove the
claim about the weights. Let $x$ be a matrix of rank $r$. Assume w.l.o.g. that $\mu_1(x) = \ldots = \mu_k(x) = 1$
and that $\mu_{k+1}(x) = \ldots = \mu_m(x) = 0$, i.e. $\mathrm{weight}(\Gamma(x)) = k$. We shall show that $k \ge nr$. For
every $y \in M_n(GF(2))$, the $n^2$ entries of $x \cdot y$ are functions of $\mu_1(x) \cdot \eta_1(y), \ldots, \mu_m(x) \cdot \eta_m(y)$. Since
$\mu_{k+1}(x) = \ldots = \mu_m(x) = 0$, we get that $x \cdot y$ is a function of $\eta_1(y), \ldots, \eta_k(y)$. Therefore there are
at most $2^k$ different matrices of the form $x \cdot y$. Since $\mathrm{rank}(x) = r$ we get that there are exactly $2^{nr}$
different matrices of the form $x \cdot y$. Therefore $k \ge nr$. This concludes the proof of the lemma. ♣
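
The counting step in this proof, namely that a matrix $x$ of rank $r$ has exactly $2^{nr}$ distinct products $x \cdot y$, can be verified exhaustively for small $n$. A brute-force sketch for $n = 2$ follows (helper names are illustrative).

```python
from itertools import product

n = 2

def all_matrices():
    """All n x n matrices over GF(2), as tuples of row tuples."""
    for bits in product((0, 1), repeat=n * n):
        yield tuple(tuple(bits[i * n:(i + 1) * n]) for i in range(n))

def mat_mul(a, b):
    return tuple(tuple(sum(a[i][k] & b[k][j] for k in range(n)) % 2
                       for j in range(n)) for i in range(n))

def rank_gf2(m):
    """Rank over GF(2) by Gaussian elimination."""
    rows = [list(r) for r in m]
    rank = 0
    for col in range(n):
        pivot = next((r for r in range(rank, n) if rows[r][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(n):
            if r != rank and rows[r][col]:
                rows[r] = [a ^ b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank

# For every x, count the distinct products x*y and compare with 2^(n*rank(x)).
counts_match = all(
    len({mat_mul(x, y) for y in all_matrices()}) == 2 ** (n * rank_gf2(x))
    for x in all_matrices()
)
```

The count holds because each of the $n$ columns of $x \cdot y$ ranges independently over the column space of $x$, which has $2^r$ elements.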

Therefore $\Gamma$ is a linear code of matrices, so according to theorem 3 we get that $m \ge 3n^2 - O(n^{5/3})$,
which is what we wanted to prove. This concludes the proof of theorem 1. ♣

4.3 Proof of Theorem 2



As in the proof of theorem 1 we will show that every quadratic circuit for $MP_n$ defines a code on
$M_n(GF(2))$. The code thus defined will have the property that $m$ (i.e. the dimension of the space
into which the code maps $M_n(GF(2))$) is exactly the number of product gates in the circuit. Thus,
according to theorem 3 we will get that $m \ge 3n^2 - O(n^{5/3})$, which is what we wanted to prove.

Let $C$ be a quadratic circuit for $MP_n$. Assume that the product gates of $C$ compute the quadratic
forms $\mu_1(x,y) \cdot \eta_1(x,y), \ldots, \mu_m(x,y) \cdot \eta_m(x,y)$. Thus, each of the outputs $(x \cdot y)_{i,j}$ can be written
as a sum of these quadratic forms:

$$(x \cdot y)_{i,j} = \sum_{k=1}^{m} \alpha^{k}_{i,j} \cdot \mu_k(x,y) \cdot \eta_k(x,y) ,$$

where $\alpha^{k}_{i,j} \in \{0,1\}$.

We would like to have a proof similar to the proof of theorem 1. In that proof we defined a code
of matrices using the linear transformation $\mu_1, \ldots, \mu_m$. Unfortunately this method will fail here as $\mu_k$
is a linear function in both the variables of $x$ and the variables of $y$ and not just in the variables of $x$
as in the proof of theorem 1. In order to overcome this obstacle we introduce a new set of variables
$z = \{z_{i,j}\}_{i,j=1 \ldots n}$. We think about $z$ as an $n \times n$ matrix. Define the following $m$ linear forms in $z$:

$$\gamma_k(z) = \sum_{i,j} \alpha^{k}_{i,j} z_{i,j} , \quad k = 1, \ldots, m .$$

We get that

$$\sum_{k=1}^{m} \mu_k(x,y) \cdot \eta_k(x,y) \cdot \gamma_k(z)
= \sum_{k=1}^{m} \sum_{i,j} \alpha^{k}_{i,j} z_{i,j} \cdot \mu_k(x,y) \cdot \eta_k(x,y)
= \sum_{i,j} z_{i,j} \sum_{k=1}^{m} \alpha^{k}_{i,j} \cdot \mu_k(x,y) \cdot \eta_k(x,y)
= \sum_{i,j} z_{i,j} \cdot (x \cdot y)_{i,j} = \mathrm{trace}(x \cdot y \cdot z^t) , \qquad (1)$$

where $(z^t)_{i,j} = z_{j,i}$. The computation that we just performed shows that the $\gamma_k$'s that we introduced
are quite natural. We also notice that $z$ plays the same role in $\mathrm{trace}(x \cdot y \cdot z^t)$ as $x$ and $y$. These
observations motivate us to try to repeat the proof of theorem 1 using the $\gamma_k$'s instead of the $\mu_k$'s.
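
The last step of the computation above uses the identity $\sum_{i,j} z_{i,j} \cdot (x \cdot y)_{i,j} = \mathrm{trace}(x \cdot y \cdot z^t)$. A quick numerical check, sketched over the integers with random $3 \times 3$ matrices (the helper names and the choice of size are illustrative assumptions):

```python
import random

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def random_matrix(n, rng):
    return [[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)]

rng = random.Random(1)
n = 3
x, y, z = (random_matrix(n, rng) for _ in range(3))
xy = mat_mul(x, y)

# Left-hand side: entries of x*y weighted by the corresponding entries of z.
weighted_sum = sum(z[i][j] * xy[i][j] for i in range(n) for j in range(n))
# Right-hand side: trace of x * y * z^t.
trace_form = trace(mat_mul(xy, transpose(z)))
```

The identity is a polynomial identity with integer coefficients, so it holds over any field, in particular over $GF(2)$.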

So define a linear mapping $\Gamma : M_n(GF(2)) \to \{0,1\}^m$ by

$$\Gamma(z) = (\gamma_1(z), \ldots, \gamma_m(z)) .$$

The following lemma shows that $\Gamma$ is indeed a linear code of matrices.

Lemma 8 $\Gamma$ is a linear mapping and it has the property that for every matrix $z$,
$\mathrm{weight}(\Gamma(z)) \ge n \cdot \mathrm{rank}(z)$.






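Proof: Clearly $\Gamma$ is a linear mapping. So we only have to prove the claim about the weights. Let
$z_0$ be a matrix of rank $r$, and assume w.l.o.g. that $\gamma_1(z_0) = \ldots = \gamma_k(z_0) = 1$ and
$\gamma_{k+1}(z_0) = \ldots = \gamma_m(z_0) = 0$. We wish to prove that $k \ge nr$. From equation (1) we get that

$$\mathrm{trace}(x \cdot y \cdot z_0^t) = \sum_{l=1}^{k} \mu_l(x,y) \cdot \eta_l(x,y) .$$

We now consider the discrete derivatives of this equation. Let $e_{i,j}$ be the matrix of all zeros but 1 in
the $(i,j)$'th place. Define

$$\frac{\partial}{\partial x_{i,j}} \mathrm{trace}(x \cdot y \cdot z_0^t) \stackrel{\mathrm{def}}{=}
\mathrm{trace}((x + e_{i,j}) \cdot y \cdot z_0^t) - \mathrm{trace}(x \cdot y \cdot z_0^t) .$$

On the one hand

$$\mathrm{trace}((x + e_{i,j}) \cdot y \cdot z_0^t) - \mathrm{trace}(x \cdot y \cdot z_0^t) =
\mathrm{trace}(e_{i,j} \cdot y \cdot z_0^t) = (z_0 \cdot y^t)_{i,j} .$$

On the other hand we have that

$$\mathrm{trace}((x + e_{i,j}) \cdot y \cdot z_0^t) - \mathrm{trace}(x \cdot y \cdot z_0^t)
= \sum_{l=1}^{k} \big( \mu_l(x + e_{i,j}, y) \cdot \eta_l(x + e_{i,j}, y) - \mu_l(x,y) \cdot \eta_l(x,y) \big)$$

$$= \sum_{l=1}^{k} \big( \mu_l(e_{i,j}, 0) \cdot \eta_l(x,y) + \mu_l(x,y) \cdot \eta_l(e_{i,j}, 0) \big)
+ \sum_{l=1}^{k} \mu_l(e_{i,j}, 0) \cdot \eta_l(e_{i,j}, 0) , \qquad (2)$$

where the last equality follows from the linearity of the $\mu_l$'s and the $\eta_l$'s. Since $(z_0 \cdot y^t)_{i,j}$ is a linear
form in $y$, we actually get that

$$(z_0 \cdot y^t)_{i,j} = \frac{\partial}{\partial x_{i,j}} \mathrm{trace}(x \cdot y \cdot z_0^t)
= \sum_{l=1}^{k} \big( \mu_l(e_{i,j}, 0) \cdot \eta_l(x,y) + \mu_l(x,y) \cdot \eta_l(e_{i,j}, 0) \big)
\in \mathrm{span}\{ \mu_l(x,y), \eta_l(x,y) \}_{l=1}^{k}$$

(since $(z_0 \cdot y^t)_{i,j}$ is a linear form, the third summand of equation (2) sums to 0). In the same manner
we define

$$\frac{\partial}{\partial y_{i,j}} \mathrm{trace}(x \cdot y \cdot z_0^t) \stackrel{\mathrm{def}}{=}
\mathrm{trace}(x \cdot (y + e_{i,j}) \cdot z_0^t) - \mathrm{trace}(x \cdot y \cdot z_0^t) .$$

We get that

$$(x^t \cdot z_0)_{i,j} = \frac{\partial}{\partial y_{i,j}} \mathrm{trace}(x \cdot y \cdot z_0^t)
\in \mathrm{span}\{ \mu_l(x,y), \eta_l(x,y) \}_{l=1}^{k} .$$

Denote by $PD$ the set of all the discrete partial derivatives

$$PD = \Big\{ \frac{\partial}{\partial x_{i,j}} \mathrm{trace}(x \cdot y \cdot z_0^t), \;
\frac{\partial}{\partial y_{i,j}} \mathrm{trace}(x \cdot y \cdot z_0^t) \Big\}_{i,j} .$$

We just proved that $PD$ is contained in the linear span of

$$\{ \mu_l(x,y), \eta_l(x,y) \}_{l=1}^{k} ,$$

in the vector space of all linear forms in $x, y$. Therefore

$$\dim(\mathrm{span}(PD)) \le \dim(\mathrm{span}\{ \mu_l(x,y), \eta_l(x,y) \}_{l=1}^{k}) \le 2k . \qquad (3)$$

We also showed that

$$\Big\{ \frac{\partial}{\partial x_{i,j}} \mathrm{trace}(x \cdot y \cdot z_0^t), \;
\frac{\partial}{\partial y_{i,j}} \mathrm{trace}(x \cdot y \cdot z_0^t) \Big\}_{i,j}
= \big\{ (x^t \cdot z_0)_{i,j}, \; (z_0 \cdot y^t)_{i,j} \big\}_{i,j} .$$

Therefore, using our assumption that $\mathrm{rank}(z_0) = r$, we get that

$$\dim(\mathrm{span}(PD)) = \dim\big(\mathrm{span}\big\{ (x^t \cdot z_0)_{i,j}, \; (z_0 \cdot y^t)_{i,j} \big\}_{i,j}\big) = 2nr . \qquad (4)$$

Combining equations (3) and (4) we get that $2k \ge 2nr$. ♣

Theorem 2 now follows from applying theorem 3 on the linear code of matrices $\Gamma$. ♣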

5 Other Finite Fields 

In this section we prove the following theorem. 

Theorem 4 The number of product gates in any bilinear circuit that computes the product of two
$n \times n$ matrices over $GF(p)$ is at least $(2.5 + \frac{1.5}{p^3-1})n^2 - O(n^{7/4})$
(i.e. $bl_*(MP_n) \ge (2.5 + \frac{1.5}{p^3-1})n^2 - O(n^{7/4})$ over $GF(p)$).

Let $C$ be a bilinear circuit for $MP_n$ over $GF(p)$. Assume that $\mu_1(x) \cdot \eta_1(y), \ldots, \mu_m(x) \cdot \eta_m(y)$
are the bilinear forms computed in the product gates of $C$. The following lemma of Bläser is the
main tool in the proof of the theorem.

Lemma 9 Let $[a,b] = ab - ba$. If there are two matrices $a, b$ such that $[a,b]$ is an invertible
matrix, and such that there are $t$ linear forms among $\mu_1, \ldots, \mu_m$ such that each of them vanishes on
$I, a, b$, then

$$m \ge t + 1.5n^2 .$$

We are going to prove that we can find $a, b$ such that $(1 - \frac{1}{p^3})n^2 + \frac{m}{p^3} - O(n^{7/4})$ linear forms among
$\mu_1, \ldots, \mu_m$ vanish on $I, a, b$, and such that $[a,b]$ is invertible.



Proof of Theorem 4: We begin by proving that (w.l.o.g.) many of the $\mu_i$'s vanish on $I$. The
following lemma shows that we can always find an invertible matrix such that many of the $\mu_i$'s
vanish on it. As before we assume that $\mu_1, \ldots, \mu_{n^2}$ are independent linear forms.

Lemma 10 There exists an invertible matrix $c$, such that at least $(1 - \frac{1}{p})n^2 + \frac{m}{p} - O(n^{5/3})$ of the $\mu_i$'s
vanish on it, where $n^2 - O(n^{5/3})$ of the $\mu_i$'s that vanish on it are among $\mu_1, \ldots, \mu_{n^2}$.

Proof: An analog of lemma 6 over $GF(p)$ guarantees that we can find $k = n^{1/3}$ matrices,
$a_1, \ldots, a_k \in M_n(GF(p))$, such that for $i \neq j$, $a_i - a_j$ is invertible, and such that $n^2 - \binom{k}{2} n$ linear forms among
$\mu_1, \ldots, \mu_{n^2}$ vanish on all of them. Denote $r = n^2 - \binom{k}{2} n$, and assume w.l.o.g. that $\mu_1, \ldots, \mu_r$
vanish on all the $a_i$'s.

Let us consider the following $k$ vectors in $GF(p)^{m-r}$:

$$\Gamma(a_i) \stackrel{\mathrm{def}}{=} (\mu_{r+1}(a_i), \ldots, \mu_m(a_i)) , \quad i = 1 \ldots k .$$

As in the proof of theorem 1, we get that since $\forall i \neq j$, $a_i - a_j$ is an invertible matrix, then
$d_H(\Gamma(a_i), \Gamma(a_j)) \ge n^2$. According to lemma 5, two of these vectors agree on at least $\frac{m-r}{p} - \frac{m-r}{k}$
coordinates. Assume that $\Gamma(a_1)$ and $\Gamma(a_2)$ are these vectors. Denote $c = a_1 - a_2$. We have that $c$ is
an invertible matrix, such that the first $r = n^2 - \binom{k}{2} n$ linear forms (which are independent) vanish on
it, and such that all the linear forms that $\Gamma(a_1)$ and $\Gamma(a_2)$ agree on vanish on it as well. Therefore
there are at least

$$r + \frac{m-r}{p} - \frac{m-r}{k}$$

linear forms that vanish on $c$. Since $r = n^2 - \binom{k}{2} n$ and $k = n^{1/3}$, we get that at least

$$\Big(1 - \frac{1}{p}\Big) n^2 + \frac{m}{p} - O(n^{5/3})$$

linear forms vanish on $c$, and $n^2 - O(n^{5/3})$ of them are among $\mu_1, \ldots, \mu_{n^2}$ (we assume for simplicity that
$n^2 \le m \le 10n^2$, as it will not change the results). This completes the proof of the lemma. ♣

The lemma doesn't tell us what $c$ is, but using the sandwiching method we can assume that $c = I$:
We know that $x \cdot y$ is computed using the bilinear forms

$\mu_1(x) \cdot \eta_1(y), \ldots, \mu_m(x) \cdot \eta_m(y)$.

We now do the following trick: $x \cdot y = (x \cdot c) \cdot (c^{-1} \cdot y)$, therefore $x \cdot y$ can be computed using the
bilinear forms

$\tilde{\mu}_1(x) \cdot \tilde{\eta}_1(y), \ldots, \tilde{\mu}_m(x) \cdot \tilde{\eta}_m(y)$,

where

$\tilde{\mu}_i(x) \stackrel{def}{=} \mu_i(x \cdot c)$ and $\tilde{\eta}_i(y) \stackrel{def}{=} \eta_i(c^{-1} \cdot y)$.

Thus, if $\mu_i(c) = 0$ then we get that $\tilde{\mu}_i(I) = \mu_i(I \cdot c) = 0$. This trick is called sandwiching, for further
background see (l|, ||. 
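
The sandwiching identity is easy to test numerically. The following is a minimal sketch over GF(5) with 2x2 matrices; the matrix `C`, the form `MU`, and all function names are hypothetical choices made for illustration, not objects from the paper.

```python
import random


def mat_mul(a, b, p):
    """Product of two square matrices with entries reduced mod p."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) % p
             for j in range(n)] for i in range(n)]


def linear_form(coeffs, x, p):
    """Evaluate the linear form sum_{i,j} coeffs[i][j] * x[i][j] over GF(p)."""
    n = len(x)
    return sum(coeffs[i][j] * x[i][j] for i in range(n) for j in range(n)) % p


def sandwiched_form(coeffs, c, p):
    """Return the map x -> mu(x * c), i.e. the form written with a tilde."""
    return lambda x: linear_form(coeffs, mat_mul(x, c, p), p)


P = 5
IDENTITY = [[1, 0], [0, 1]]
C = [[1, 1], [0, 1]]          # an invertible matrix over GF(5)
C_INV = [[1, P - 1], [0, 1]]  # its inverse: C * C_INV = I (mod 5)


def sandwich_identity_holds(x, y):
    """Check x*y == (x*C) * (C_INV*y) over GF(P)."""
    left = mat_mul(mat_mul(x, C, P), mat_mul(C_INV, y, P), P)
    return left == mat_mul(x, y, P)


rng = random.Random(0)
samples = [[[rng.randrange(P) for _ in range(2)] for _ in range(2)]
           for _ in range(20)]
print(all(sandwich_identity_holds(x, y) for x in samples for y in samples))

# MU(x) = x[0][0] + 4*x[0][1] vanishes on C, since 1 + 4 = 5 = 0 (mod 5),
# so the sandwiched form vanishes on the identity matrix.
MU = [[1, 4], [0, 0]]
print(linear_form(MU, C, P), sandwiched_form(MU, C, P)(IDENTITY))
```

The second print shows the point of the trick: a form vanishing on $c$ becomes, after sandwiching, a form vanishing on $I$.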



So by combining the sandwiching method and lemma 10 we can assume w.l.o.g. that
$(1 - \frac{1}{p})n^2 + \frac{m}{p} - O(n^{5/3})$ of the $\mu_i$'s, where $n^2 - O(n^{5/3})$ of them are among
$\mu_1, \ldots, \mu_{n^2}$, vanish on $I$. The next lemma now assures us that we can find two matrices $a, b$ that
satisfy the requirements of lemma 9.

Lemma 11 There are two matrices $a, b$ such that $[a, b]$ is an invertible matrix and such that at least
$(1 - \frac{1}{p^3})n^2 + \frac{m}{p^3} - O(n^{7/4})$ of the $\mu_i$'s vanish on $I, a, b$.



Proof: The proof of this lemma is similar to the proof of lemma 10. Let $k = n^{1/4}$. The following
lemma shows that we can find $k$ matrices such that many of the $\mu_i$'s vanish on all of them and such
that among their differences there are matrices satisfying the requirements of lemma 9.

Lemma 12 For every $n, k$ such that $p^{n/2} > 4\binom{k}{3}$, and any $\mu_1, \ldots, \mu_{n^2}$ linearly independent linear
forms, in $n^2$ variables, over $GF(p)$, there are $k$ matrices, $a_1, \ldots, a_k$, such that $\forall i < j < l$,
$[a_i - a_l, a_j - a_l]$ is invertible, and such that $n^2 - 2\binom{k}{3}n$ of the $\mu_i$'s vanish on all the $a_i$'s.

Proof: Again we use the earlier lemma. Let $P$ be the following polynomial:

$P(a_1, \ldots, a_k) = \det\Bigl(\prod_{i<j<l} [a_i - a_l, a_j - a_l]\Bigr)$.

Clearly $\deg(P) = 2\binom{k}{3}n$ (as a polynomial in the entries of the $a_i$'s). Therefore if we prove that
$P \not\equiv 0$, i.e. that there exist $k$ matrices on which $P$ is not zero, then according to lemma 3 we are
done. This is guaranteed by the following lemma.

Lemma 13 If $p^{n/2} > 4\binom{k}{3}$ then there exist $k$ matrices in $M_n(GF(p))$, $a_1, \ldots, a_k$, such that $\forall i < j < l$,
$[a_i - a_l, a_j - a_l]$ is invertible.

We prove the lemma only for $n$ even. Clearly this will not affect the theorem, as the lower bound
for odd $n$ follows from the lower bound for even $n$.



Proof: Consider the following polynomial in $4k$ variables (i.e. it is a polynomial in $k$ matrices over
$M_2(GF(p))$):

$Q(x_1, \ldots, x_k) = \det\Bigl(\prod_{i<j<l} [x_i - x_l, x_j - x_l]\Bigr)$.

$Q$ is a polynomial of degree $d = 4\binom{k}{3}$ over $GF(p)$, in the entries of the $x_i$'s. Clearly $Q$ is not the
zero polynomial (as it is a product of non zero polynomials). Consider the field $F = GF(p^{n/2})$. Since
$d < |F|$ we get by the earlier lemma that there are $k$ matrices $\rho_1, \ldots, \rho_k \in M_2(F)$ such that
$Q(\rho_1, \ldots, \rho_k) \ne 0$. That is, $\forall i < j < l$, $[\rho_i - \rho_l, \rho_j - \rho_l]$ is an invertible matrix.
According to an earlier lemma we can embed $M_2(F)$ in $M_n(GF(p))$. Therefore there are $k$ matrices in
$M_n(GF(p))$ satisfying $\forall i < j < l$, $[a_i - a_l, a_j - a_l]$ is an invertible matrix, which is what we
wanted to prove.
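
The condition in lemma 13 can be illustrated on the smallest case, $k = 3$ matrices in $M_2(GF(p))$. This sketch assumes that $[a, b]$ denotes the commutator $ab - ba$; that reading matches the degree count $\deg(P) = 2\binom{k}{3}n$ but is not spelled out in this section. The helper names and the example matrices are illustrative only.

```python
from itertools import combinations


def mat_mul(a, b, p):
    """Product of two square matrices with entries reduced mod p."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) % p
             for j in range(n)] for i in range(n)]


def mat_sub(a, b, p):
    """Entrywise difference of two matrices mod p."""
    n = len(a)
    return [[(a[i][j] - b[i][j]) % p for j in range(n)] for i in range(n)]


def commutator(a, b, p):
    """Assumed meaning of [a, b]: the commutator ab - ba over GF(p)."""
    return mat_sub(mat_mul(a, b, p), mat_mul(b, a, p), p)


def det2(m, p):
    """Determinant of a 2x2 matrix mod p."""
    return (m[0][0] * m[1][1] - m[0][1] * m[1][0]) % p


def triples_condition_holds(mats, p):
    """True if [a_i - a_l, a_j - a_l] is invertible for all i < j < l."""
    for i, j, l in combinations(range(len(mats)), 3):
        u = mat_sub(mats[i], mats[l], p)
        v = mat_sub(mats[j], mats[l], p)
        if det2(commutator(u, v, p), p) == 0:
            return False
    return True


IDENTITY = [[1, 0], [0, 1]]
ZERO = [[0, 0], [0, 0]]
E12 = [[0, 1], [0, 0]]
E21 = [[0, 0], [1, 0]]

print(triples_condition_holds([E12, E21, ZERO], 3))       # commutator is diag(1, -1)
print(triples_condition_holds([IDENTITY, E12, ZERO], 3))  # identity commutes with everything
```

For larger $k$ the proof does not search at all: it picks the matrices over the extension field $F$ using the degree bound $d < |F|$, and then embeds $M_2(F)$ into $M_n(GF(p))$.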
This concludes the proof of lemma 12.

We proceed with the proof of lemma 11. We now restrict our attention to the linear forms
among $\mu_{n^2+1}, \ldots, \mu_m$ that vanish on $I$. We shall prove that three of the matrices guaranteed by
lemma 12 agree on many of these linear forms (more formally on $\frac{m-n^2}{p^3} - O(n^{7/4})$ of them). Thus, if
$a_1, a_2, a_3$ are these three matrices, then we get that $(1 - \frac{1}{p^3})n^2 + \frac{m}{p^3} - O(n^{7/4})$ linear forms vanish
on $I, (a_1 - a_3), (a_2 - a_3)$, and that $[(a_1 - a_3), (a_2 - a_3)]$ is an invertible matrix, which is what we
wanted to prove.

So assume w.l.o.g. that the linear forms $\mu_{n^2+1}, \ldots, \mu_{n^2+r}$ vanish on $I$ (beside those among
$\mu_1, \ldots, \mu_{n^2}$ that vanish on it) where $r \ge \frac{m-n^2}{p} - O(n^{5/3})$. Let $a_1, \ldots, a_k$ be the matrices guaranteed
by lemma 12. Consider the following vectors: $\forall 1 \le i \le k$,

$v_i = (\mu_{n^2+1}(a_i), \ldots, \mu_{n^2+r}(a_i)) \in GF(p)^r$.

According to lemma 5 three of these vectors, namely $v_1, v_2, v_3$, agree on at least $\frac{r}{p^2} - \frac{3r}{pk}$ coordinates.
Therefore there are $\frac{r}{p^2} - \frac{3r}{pk}$ linear forms among $\mu_{n^2+1}, \ldots, \mu_{n^2+r}$ that vanish on $a_1 - a_3$ and $a_2 - a_3$.
In addition there are $n^2 - 2\binom{k}{3}n$ linear forms among $\mu_1, \ldots, \mu_{n^2}$ that vanish on $a_1, a_2, a_3$, hence
there are $n^2 - 2\binom{k}{3}n$ linear forms among $\mu_1, \ldots, \mu_{n^2}$ that vanish on $a_1 - a_3$ and on $a_2 - a_3$. Let

$a = a_1 - a_3$, $b = a_2 - a_3$.

We get that there are $\frac{r}{p^2} - \frac{3r}{pk}$ linear forms among $\mu_{n^2+1}, \ldots, \mu_{n^2+r}$ that vanish on $I, a, b$. Since
$n^2 - O(n^{5/3})$ of the first $n^2$ $\mu_i$'s vanish on $I$, we get that at least $n^2 - 2\binom{k}{3}n - O(n^{5/3})$ of the first $n^2$
$\mu_i$'s vanish on $I, a, b$. Putting it all together we get that at least

$n^2 - 2\binom{k}{3}n - O(n^{5/3}) + \frac{r}{p^2} - \frac{3r}{pk}$

linear forms among $\mu_1, \ldots, \mu_m$ vanish on $I, a, b$. Since $r \ge \frac{m-n^2}{p} - O(n^{5/3})$, and $k = n^{1/4}$, we get
that at least

$(1 - \frac{1}{p^3})n^2 + \frac{m}{p^3} - O(n^{7/4})$

of the $\mu_i$'s vanish on them. This concludes the proof of lemma 11.
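
The bookkeeping behind the last step can be checked with exact arithmetic. The sketch below assumes the main terms $n^2 + r/p^2$ with $r = (m - n^2)/p$, drops all lower-order terms, and separately checks that the triple-count term $2\binom{k}{3}n$ fits inside $O(n^{7/4})$ when $k = n^{1/4}$. The function names are illustrative.

```python
from fractions import Fraction
from math import comb


def main_term_direct(n, m, p):
    """n^2 + r/p^2 with r = (m - n^2)/p, lower-order terms dropped."""
    r = Fraction(m - n * n, p)
    return n * n + r / (p * p)


def main_term_stated(n, m, p):
    """(1 - 1/p^3) n^2 + m/p^3, the main term in lemma 11."""
    return (1 - Fraction(1, p ** 3)) * n * n + Fraction(m, p ** 3)


def triple_term_within_budget(k):
    """For n = k^4 (so k = n^(1/4)), check 2*C(k,3)*n <= n^(7/4)/3 = k^7/3."""
    n = k ** 4
    return 3 * 2 * comb(k, 3) * n <= k ** 7


print(main_term_direct(10, 250, 3) == main_term_stated(10, 250, 3))
print(all(triple_term_within_budget(k) for k in range(1, 50)))
```

Taking $n = k^4$ keeps every quantity an integer, so the comparison involves no rounding.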



Putting everything together we get by lemma 9 and lemma 11 that:

$m \ge 1.5n^2 + (1 - \frac{1}{p^3})n^2 + \frac{m}{p^3} - O(n^{7/4})$.

Therefore

$m \ge (2.5 + \frac{1.5}{p^3 - 1})n^2 - O(n^{7/4})$.

This concludes the proof of the theorem.
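
The final step is a one-line rearrangement: moving $m/p^3$ to the left of $m \ge 1.5n^2 + (1 - \frac{1}{p^3})n^2 + \frac{m}{p^3}$ and dividing by $1 - \frac{1}{p^3}$ gives the coefficient of $n^2$. A short exact-arithmetic check of that algebra, ignoring the $O(n^{7/4})$ term (function names are illustrative):

```python
from fractions import Fraction


def coefficient_from_inequality(p):
    """Solve m*(1 - 1/p^3) >= (2.5 - 1/p^3)*n^2 for the coefficient of n^2."""
    q = Fraction(1, p ** 3)
    return (Fraction(5, 2) - q) / (1 - q)


def coefficient_closed_form(p):
    """The closed form 2.5 + 1.5/(p^3 - 1)."""
    return Fraction(5, 2) + Fraction(3, 2) / (p ** 3 - 1)


for p in (2, 3, 5, 7):
    print(p, coefficient_closed_form(p), float(coefficient_closed_form(p)))
```

The gain over $2.5n^2$ shrinks quickly as $p$ grows, since the extra term is $1.5/(p^3 - 1)$.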